home *** CD-ROM | disk | FTP | other *** search
- This chronicle briefly describes the progress of agrep.
-
- Feb/91: The approximate pattern matching algorithm called 'bitap'
- (bit-parallel approximate pattern matching) is designed.
- The algorithm is a generalization of Baeza-Yates' "shift-or"
- algorithm for exact matching.
-
- Mar/91: Many extensions of the algorithm 'bitap' are found, especially
- for approximate regular expression pattern matching. Preliminary
- implementation of the algorithm showed a strong promise for
- a general-purpose fast approximate pattern-matching tool.
-
- Apr/91: Approximate regular expression pattern matching was implemented.
- The result is even better than expected.
- The design of the software tool is pinned down.
- (For example, record oriented, multi-pattern, AND/OR logic queries.)
- A partition technique for approximate pattern matching is used.
-
- May/91: The prototype of "agrep" is completed.
- A lot of debugging/optimization in this month.
-
- Jun/91: The first version of agrep is released.
- agrep 1.0 was announced and made available by anonymous ftp
- from cs.arizona.edu.
-
- Jul/91: A sub-linear expected-time algorithm, called "amonkey" for
- approximate pattern matching (for simple pattern) is designed.
- The algorithm has the same time complexity as that of
- Chang&Lawler but is much much faster in practice.
- The algorithm is based on a variation of Boyer-Moore technique,
- which we call "block-shifting."
- A sub-linear expected-time algorithm, called "mgrep" for
- matching a set of patterns is designed based on the "block-shifting"
- technique with a hashing technique.
-
- Aug/91: "amonkey" is implemented and incorporated into agrep.
- It is very fast for long patterns like DNA patterns.
- (But roughly the same for matching English words as the bitap
- algorithm using the partition technique.)
- Prototype of "mgrep" is implemented.
-
- Sep/91: "mgrep" is incorporated into agrep to support the -f option.
- An algorithm for approximate pattern matching that combines the
- 'partition' technique with the sub-linear expected-time algorithm
- for multi-patterns is designed.
- Implementation shows it to be the fastest for ASCII text (and pattern).
- Boyer-moore technique for exact matching is incorporated.
-
- Nov/91: The final paper of "agrep" that is to appear in USENIX
- conference (Jan 1992) is finished.
-
- Jan/92: Some new options are added, such as find best matches (-B),
- and file outputs (-G).
- The man pages are revised.
- agrep version 2.0 is released.
-
-